and energy, which, like information, is another fundamental, irreducible concept. 1
Although the doctoral thesis of Shannon, one of the fathers of information theory, was
entitled “An algebra for theoretical genetics”, apart from genetics, biology remained
largely untouched by developments in information science.
One might speculate on why information was placed so firmly at the core of
molecular biology by one of its pioneers. During the preceding decade, there had
been tremendous advances in the theory of communication, the science of the
transmission of information. Shannon published his seminal paper on the mathematical
theory of communication only a few years before Watson and Crick’s work. In that
context, the notion of a sequence of DNA bases as a message with meaning seemed
only natural, and the next major development, the establishment of the genetic code
with which the DNA sequence could be transformed into a protein sequence, was
cast very much in the language and concepts of communication theory. More puzzling
is that there was not subsequently a more vigorous interchange between the two
disciplines. The lack of extensive datasets and of powerful computers, without which
the necessary calculations were intolerably tedious or simply took too long, probably
provides sufficient explanation for this neglect. Hence, now that both these
requirements are being met, it is not surprising that there is
a great revival in the application of information ideas to biology. One may indeed
hope that this revival will at last lead to a real answer being advanced in response to
the vital question “what is life?” In other words, information science is perhaps the
missing discipline that, along with the physics and chemistry already being brought
to bear, is needed to answer the question.
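
As a small illustration of the genetic code viewed as a code in the communication-theory sense, the following Python sketch maps a DNA "message" onto a protein sequence, three bases at a time. It assumes the standard genetic code and a coding-strand sequence read from its first base; the names `CODON_TABLE` and `translate` are hypothetical and chosen only for this example.

```python
from itertools import product

# Bases in the conventional T, C, A, G order used to tabulate the code.
BASES = "TCAG"

# Standard genetic code in one-letter amino acid notation, listed in the
# order TTT, TTC, TTA, TTG, TCT, ... , GGG. "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry lookup table: codon (three bases) -> amino acid.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # prints "MA" (Met, Ala, then stop)
```

The mapping is many-to-one: 64 codons encode 20 amino acids and a stop signal, so the protein sequence cannot in general be decoded back into a unique DNA sequence.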
1.1 What is Bioinformatics?
The term “bioinformatics” seems to have been first used in the mid-1980s in order to
describe the application of information science and technology in the life sciences.
The definition was at that time very general, covering everything from robotics to
artificial intelligence. Later, bioinformatics came to be somewhat prosaically defined
as “the use of computers to retrieve, process, analyse, and simulate biological
information”. An even narrower definition was “the application of information technology
to the management of biological data”. Such definitions fail to capture the centrality
of information in biology. If, indeed, information is the most fundamental concept
underlying biology and bioinformatics is the exploration of all the ramifications and
implications of that basis, then bioinformatics is excellently positioned to revive
consideration of the central question “what is life?” A more appropriate definition
of bioinformatics is, therefore, “the science of how information is generated, trans-
1 The two are, of course, intimately related. Energy may be needed to produce information and, as
Szilard showed in his exorcism of Maxwell’s demon, the judicious use of information can produce
energy.